Before jumping right into Time Series Analysis, let's first understand what Time Series data is.
Time series data is a sequence of data points collected over time. Time is a crucial variable because it shows how the data evolves across the data points as well as in the final results; it provides an additional source of information and imposes a fixed order of dependencies between the data points.
Time series data comes in two types:
1. Measurements gathered at regular time intervals (metrics)
2. Measurements gathered at irregular time intervals (events)
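The distinction shows up directly in pandas: a metrics series has one constant spacing between timestamps, while an events series does not. A minimal sketch with made-up readings:

```python
import pandas as pd

# Metrics: measurements gathered at regular intervals (here, hourly readings)
metrics = pd.Series(
    [0.41, 0.39, 0.52, 0.47],
    index=pd.date_range("2023-01-01", periods=4, freq="H"),
)

# Events: measurements gathered at irregular intervals (here, arbitrary timestamps)
events = pd.Series(
    [1, 1, 1],
    index=pd.to_datetime(["2023-01-01 00:07", "2023-01-01 02:41", "2023-01-01 02:43"]),
)

# A regular series has a single, constant gap between observations
regular_gaps = metrics.index.to_series().diff().dropna().unique()
irregular_gaps = events.index.to_series().diff().dropna().unique()
print(len(regular_gaps), len(irregular_gaps))  # 1 2
```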
Now that we have understood what Time Series data means, let's understand what Time Series analysis is.
Some of the common types of time series analysis include:
Classification: It identifies and assigns categories to the data.
Curve Fitting: It plots data on a curve to investigate the relationships between variables in the data.
Descriptive Analysis: Patterns in time-series data, such as trends, cycles, and seasonal variation, are identified.
Explanatory analysis: It attempts to understand the data and the cause-and-effect relationships within it.
Segmentation: It splits the data into segments to reveal the source data's underlying properties.
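Curve fitting, for instance, can be sketched with `np.polyfit`: fitting a straight line (a degree-1 polynomial) to a noisy synthetic trend recovers the underlying relationship between time and value. The data here is made up for illustration:

```python
import numpy as np

rng = np.random.default_rng(0)
t = np.arange(50, dtype=float)
# Synthetic series: linear trend (slope 2, intercept 5) plus noise
y = 2.0 * t + 5.0 + rng.normal(scale=1.0, size=t.size)

slope, intercept = np.polyfit(t, y, deg=1)  # degree-1 curve fit
print(slope, intercept)  # close to the true slope 2 and intercept 5
```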
import pandas as pd
from datetime import timedelta
import numpy as np
import tensorflow as tf
from sklearn.preprocessing import StandardScaler
from sklearn.metrics import mean_absolute_error
from matplotlib import pyplot as plt
from typing import List
import math
import time
import plotly.express as px
import seaborn as sns
# stats tools
import statsmodels.api as sm
from statsmodels.tsa.stattools import adfuller
from statsmodels.tsa.seasonal import seasonal_decompose
from statsmodels.graphics.tsaplots import plot_acf, plot_pacf
from tensorflow.keras import Sequential
pip install yfinance
import yfinance as yf #https://pypi.org/project/yfinance/ - finance API
data = yf.download("AAPL", start="2000-01-01", end="2023-02-10")
data.head()
| Open | High | Low | Close | Adj Close | Volume | |
|---|---|---|---|---|---|---|
| Date | ||||||
| 2000-01-03 | 0.936384 | 1.004464 | 0.907924 | 0.999442 | 0.850643 | 535796800 |
| 2000-01-04 | 0.966518 | 0.987723 | 0.903460 | 0.915179 | 0.778926 | 512377600 |
| 2000-01-05 | 0.926339 | 0.987165 | 0.919643 | 0.928571 | 0.790324 | 778321600 |
| 2000-01-06 | 0.947545 | 0.955357 | 0.848214 | 0.848214 | 0.721931 | 767972800 |
| 2000-01-07 | 0.861607 | 0.901786 | 0.852679 | 0.888393 | 0.756128 | 460734400 |
Volume - Volume is the total number of shares traded in a security over a given period.

### Why is a Stock’s Closing Price Significant?
A stock’s closing price is the standard benchmark used to track how a share performed during the trading day.
data.shape
(5814, 6)
data.isna().sum() # check for missing values
Open 0 High 0 Low 0 Close 0 Adj Close 0 Volume 0 dtype: int64
data.info()
<class 'pandas.core.frame.DataFrame'> DatetimeIndex: 5814 entries, 2000-01-03 to 2023-02-09 Data columns (total 6 columns): # Column Non-Null Count Dtype --- ------ -------------- ----- 0 Open 5814 non-null float64 1 High 5814 non-null float64 2 Low 5814 non-null float64 3 Close 5814 non-null float64 4 Adj Close 5814 non-null float64 5 Volume 5814 non-null int64 dtypes: float64(5), int64(1) memory usage: 318.0 KB
data = data.reset_index() # move the DatetimeIndex into a regular "Date" column
data
| Date | Open | High | Low | Close | Adj Close | Volume | |
|---|---|---|---|---|---|---|---|
| 0 | 2000-01-03 | 0.936384 | 1.004464 | 0.907924 | 0.999442 | 0.850643 | 535796800 |
| 1 | 2000-01-04 | 0.966518 | 0.987723 | 0.903460 | 0.915179 | 0.778926 | 512377600 |
| 2 | 2000-01-05 | 0.926339 | 0.987165 | 0.919643 | 0.928571 | 0.790324 | 778321600 |
| 3 | 2000-01-06 | 0.947545 | 0.955357 | 0.848214 | 0.848214 | 0.721931 | 767972800 |
| 4 | 2000-01-07 | 0.861607 | 0.901786 | 0.852679 | 0.888393 | 0.756128 | 460734400 |
| ... | ... | ... | ... | ... | ... | ... | ... |
| 5809 | 2023-02-03 | 148.029999 | 157.380005 | 147.830002 | 154.500000 | 154.264465 | 154279900 |
| 5810 | 2023-02-06 | 152.570007 | 153.100006 | 150.779999 | 151.729996 | 151.498688 | 69858300 |
| 5811 | 2023-02-07 | 150.639999 | 155.229996 | 150.639999 | 154.649994 | 154.414230 | 83322600 |
| 5812 | 2023-02-08 | 153.880005 | 154.580002 | 151.169998 | 151.919998 | 151.688400 | 64120100 |
| 5813 | 2023-02-09 | 153.779999 | 154.330002 | 150.419998 | 150.869995 | 150.639999 | 56007100 |
5814 rows × 7 columns
data = data.set_index(data['Date']).sort_index()
data.columns
Index(['Date', 'Open', 'High', 'Low', 'Close', 'Adj Close', 'Volume'], dtype='object')
data["Date"].min(), data["Date"].max() # check the date range of the dataset
(Timestamp('2000-01-03 00:00:00'), Timestamp('2023-02-09 00:00:00'))
data.plot(x="Date", y="Open", figsize=(8,5))
<AxesSubplot: xlabel='Date'>
data.plot(x="Date", y="Close", figsize=(8,5))
<AxesSubplot: xlabel='Date'>
data.plot(x="Date", y="High", figsize=(8,5))
<AxesSubplot: xlabel='Date'>
data.plot(x="Date", y="Low", figsize=(8,5))
<AxesSubplot: xlabel='Date'>
sns.kdeplot(data['Close'], fill=True)
<AxesSubplot: xlabel='Close', ylabel='Density'>
data[["Open", "High","Low","Close"]].corr()
| Open | High | Low | Close | |
|---|---|---|---|---|
| Open | 1.000000 | 0.999924 | 0.999907 | 0.999802 |
| High | 0.999924 | 1.000000 | 0.999892 | 0.999908 |
| Low | 0.999907 | 0.999892 | 1.000000 | 0.999910 |
| Close | 0.999802 | 0.999908 | 0.999910 | 1.000000 |
The columns are almost perfectly correlated, so we can keep just one of them for prediction, or compress them with PCA.
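МГК (MGK) is the Russian abbreviation for principal component analysis (PCA). A hedged sketch of the idea on synthetic, strongly correlated columns mimicking Open/High/Low/Close — the data is artificial, but it shows why a single component is enough when the correlation is near 1:

```python
import numpy as np
from sklearn.decomposition import PCA

rng = np.random.default_rng(42)
base = np.cumsum(rng.normal(size=500))  # a random-walk "price"
# Four highly correlated columns, mimicking Open/High/Low/Close
ohlc = np.column_stack([base + rng.normal(scale=0.05, size=500) for _ in range(4)])

pca = PCA(n_components=1).fit(ohlc)
# One component captures nearly all the variance of the four columns
print(pca.explained_variance_ratio_[0])
```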
# As mentioned earlier "When researching historical stock price data,use the closing price as the standard measure of the stock’s value"
# so let's try visualising the close price of the dataset using plotly
fig = px.line(data,x="Date",y="Close",title="Closing Price: Range Slider and Selectors")
fig.update_xaxes(rangeslider_visible=True,rangeselector=dict(
buttons=list([
dict(count=1,label="1m",step="month",stepmode="backward"),
dict(count=6,label="6m",step="month",stepmode="backward"),
dict(count=1,label="YTD",step="year",stepmode="todate"),
dict(count=1,label="1y",step="year",stepmode="backward"),
dict(step="all")
])))
series = data['Close']
result = seasonal_decompose(series, model='additive', period=1) # the frequency is daily
figure = result.plot()
We have a trend, but no daily seasonality.
series = data['Close']
result = seasonal_decompose(series, model='additive', period=365) # look for yearly seasonality
figure = result.plot()
We have a trend and yearly seasonality, and we can also see that the series is noisy.
data_10y = data[data["Date"] > data["Date"].max() - timedelta(days=365*10)] # keep only the last 10 years, so that old values don't influence the prediction
data_10y["Date"].min(), data_10y["Date"].max()
(Timestamp('2013-02-12 00:00:00'), Timestamp('2023-02-09 00:00:00'))
data_10y.shape
(2517, 7)
sns.kdeplot(data_10y['Close'], fill=True)
<AxesSubplot: xlabel='Close', ylabel='Density'>
train_size = int(data_10y.shape[0]*0.8) # number of samples in the train dataset
train_data = data_10y[:train_size]
validate_data = data_10y[train_size:]
data_10y.shape, train_data.shape, validate_data.shape
((2517, 7), (2013, 7), (504, 7))
train_data
| Date | Open | High | Low | Close | Adj Close | Volume | |
|---|---|---|---|---|---|---|---|
| Date | |||||||
| 2013-02-12 | 2013-02-12 | 17.125357 | 17.227858 | 16.705000 | 16.710714 | 14.432729 | 609053200 |
| 2013-02-13 | 2013-02-13 | 16.686071 | 16.915714 | 16.543571 | 16.678928 | 14.405277 | 475207600 |
| 2013-02-14 | 2013-02-14 | 16.590000 | 16.844286 | 16.572144 | 16.663929 | 14.392324 | 355275200 |
| 2013-02-15 | 2013-02-15 | 16.744642 | 16.791430 | 16.425714 | 16.434286 | 14.193984 | 391745200 |
| 2013-02-19 | 2013-02-19 | 16.467857 | 16.526072 | 16.208929 | 16.428213 | 14.188734 | 435783600 |
| ... | ... | ... | ... | ... | ... | ... | ... |
| 2021-02-03 | 2021-02-03 | 135.759995 | 135.770004 | 133.610001 | 133.940002 | 132.149460 | 89880900 |
| 2021-02-04 | 2021-02-04 | 136.300003 | 137.399994 | 134.589996 | 137.389999 | 135.553329 | 84183100 |
| 2021-02-05 | 2021-02-05 | 137.350006 | 137.419998 | 135.860001 | 136.759995 | 135.133377 | 75693800 |
| 2021-02-08 | 2021-02-08 | 136.029999 | 136.960007 | 134.919998 | 136.910004 | 135.281570 | 71297200 |
| 2021-02-09 | 2021-02-09 | 136.619995 | 137.880005 | 135.850006 | 136.009995 | 134.392303 | 76774200 |
2013 rows × 7 columns
train_data["Date"].min(), train_data["Date"].max()
(Timestamp('2013-02-12 00:00:00'), Timestamp('2021-02-09 00:00:00'))
validate_data["Date"].min(), validate_data["Date"].max()
(Timestamp('2021-02-10 00:00:00'), Timestamp('2023-02-09 00:00:00'))
train_data[["Low"]]
| Low | |
|---|---|
| Date | |
| 2013-02-12 | 16.705000 |
| 2013-02-13 | 16.543571 |
| 2013-02-14 | 16.572144 |
| 2013-02-15 | 16.425714 |
| 2013-02-19 | 16.208929 |
| ... | ... |
| 2021-02-03 | 133.610001 |
| 2021-02-04 | 134.589996 |
| 2021-02-05 | 135.860001 |
| 2021-02-08 | 134.919998 |
| 2021-02-09 | 135.850006 |
2013 rows × 1 columns
scaler = StandardScaler() # z = (x - u) / s, where u is the mean and s is the standard deviation
scaler.fit(train_data[["Close"]]) # fit the scaler on the training data only
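The formula in the comment can be verified on a toy column (numbers made up): `transform` reproduces z = (x - u) / s, where s is the population standard deviation.

```python
import numpy as np
from sklearn.preprocessing import StandardScaler

x = np.array([[1.0], [2.0], [3.0], [4.0]])
scaler = StandardScaler().fit(x)
z = scaler.transform(x)

# Matches z = (x - u) / s with u = mean, s = population standard deviation (ddof=0)
u, s = x.mean(), x.std()
manual = (x - u) / s
print(np.allclose(z, manual))  # True
```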
def make_dataset(
    df,               # the data to build the dataset from
    window_size,      # number of elements used to predict the next element
    batch_size,       # number of elements per training batch
    use_scaler=True,  # whether to normalize the values with the scaler
    shuffle=True      # whether to shuffle the elements in the dataset
):
    features = df[["Close"]][:-window_size] # the last window_size elements have no label, so drop them from the features
    if use_scaler: # normalize the features if requested
        features = scaler.transform(features)
    data = np.array(features, dtype=np.float32) # cast the data to the required type
    ds = tf.keras.preprocessing.timeseries_dataset_from_array( # build a time-series dataset for training the network
        data=data, # a multi-dimensional array containing the features
        targets=df["Close"][window_size:], # the labels, shifted forward by window_size elements,
        # i.e. for every window of N feature elements, element N+1 serves as the label
        sequence_length=window_size, # the length of each input sequence
        sequence_stride=1, # how far to shift between consecutive samples, i.e. we predict every next element
        shuffle=shuffle, # whether to shuffle the samples
        batch_size=batch_size # the batch size
    )
    return ds
example_ds = make_dataset(df=train_data, window_size=3, batch_size=2, use_scaler=False, shuffle=False)
example_feature, example_label = next(example_ds.as_numpy_iterator())
example_feature.shape
(2, 3, 1)
example_label.shape
(2,)
train_data["Close"][:6]
Date 2013-02-12 16.710714 2013-02-13 16.678928 2013-02-14 16.663929 2013-02-15 16.434286 2013-02-19 16.428213 2013-02-20 16.030357 Name: Close, dtype: float64
print(example_feature[0])
print(example_label[0])
[[16.710714] [16.678928] [16.663929]] 16.43428611755371
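What `timeseries_dataset_from_array` produces can be reproduced by hand, which makes the windowing explicit: every run of `window_size` consecutive closes is paired with the close that immediately follows it. A NumPy sketch using the same first six close values:

```python
import numpy as np

series = np.array([16.71, 16.68, 16.66, 16.43, 16.43, 16.03], dtype=np.float32)
window_size = 3

# Mirror timeseries_dataset_from_array by hand:
# window i covers series[i : i + window_size] and predicts series[i + window_size]
features = np.stack([series[i:i + window_size] for i in range(len(series) - window_size)])
labels = series[window_size:]

print(features.shape, labels.shape)  # (3, 3) (3,)
print(features[0], labels[0])        # first window -> the 4th value
```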
window_size=10
batch_size=8
train_ds = make_dataset(df=train_data, window_size=window_size, batch_size=batch_size, use_scaler=True, shuffle=True)
val_ds = make_dataset(df=validate_data, window_size=window_size, batch_size=batch_size, use_scaler=True, shuffle=True)
train_ds
<BatchDataset element_spec=(TensorSpec(shape=(None, None, 1), dtype=tf.float32, name=None), TensorSpec(shape=(None,), dtype=tf.float64, name=None))>
val_ds
<BatchDataset element_spec=(TensorSpec(shape=(None, None, 1), dtype=tf.float32, name=None), TensorSpec(shape=(None,), dtype=tf.float64, name=None))>
lstm_model = tf.keras.models.Sequential([
tf.keras.layers.LSTM(32, return_sequences=False),
tf.keras.layers.Dense(1)
])
def compile_and_fit(model, train_ds, val_ds, num_epochs: int = 20): # takes the model, the train and validation datasets, and the number of training epochs
    model.compile( # the model is compiled first
        loss=tf.losses.MeanSquaredError(), # the loss function used for training - this is regression (predicting a real number), so MeanSquaredError is used
        optimizer=tf.optimizers.Adam(), # the optimizer used to minimize the loss function
        metrics=[tf.metrics.MeanAbsoluteError(), tf.keras.metrics.MeanAbsolutePercentageError()
        ] # a list of metrics to watch during training - do they grow, shrink or stay flat
    )
    history = model.fit( # the training happens here
        train_ds, # the training dataset
        epochs=num_epochs, # number of epochs - how many full passes we make over the entire training dataset
        validation_data=val_ds, # the data used to compute validation metrics - they show whether the model is overfitting
        verbose=0 # whether to print progress information during training
    )
    return history # an object that holds the metric values for each epoch
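Training for a fixed number of epochs (100 and 300 below) risks overfitting. Keras ships an `EarlyStopping` callback for this; its core logic — stop once the validation loss has not improved for `patience` epochs — is simple enough to sketch in plain Python on a hypothetical loss curve:

```python
def early_stopping_index(val_losses, patience=3):
    """Return the epoch index at which training would stop: the first epoch
    after the validation loss has failed to improve for `patience`
    consecutive epochs (or the last epoch if that never happens)."""
    best, best_epoch = float("inf"), 0
    for epoch, loss in enumerate(val_losses):
        if loss < best:
            best, best_epoch = loss, epoch
        elif epoch - best_epoch >= patience:
            return epoch
    return len(val_losses) - 1

# Hypothetical validation-loss curve: improves, then plateaus
losses = [1.0, 0.8, 0.6, 0.55, 0.56, 0.57, 0.58, 0.59]
print(early_stopping_index(losses, patience=3))  # 6: three epochs after the best (epoch 3)
```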
start_time = time.time()
lstm_model = tf.keras.models.Sequential([ #группировка слоев
tf.keras.layers.LSTM(32, return_sequences=False),
tf.keras.layers.Dense(1)
])
history = compile_and_fit(lstm_model, train_ds, val_ds, num_epochs=100)
print("--- %s seconds ---" % (time.time() - start_time))
--- 59.351494789123535 seconds ---
lstm_model.summary()
Model: "sequential_2"
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
lstm_2 (LSTM) (None, 32) 4352
dense_2 (Dense) (None, 1) 33
=================================================================
Total params: 4,385
Trainable params: 4,385
Non-trainable params: 0
_________________________________________________________________
plt.plot(history.history['mean_absolute_error'])
[<matplotlib.lines.Line2D at 0x253f0e553a0>]
plt.plot(history.history['val_mean_absolute_error'])
[<matplotlib.lines.Line2D at 0x253fd4ea400>]
lstm_model.evaluate(train_ds)
250/250 [==============================] - 1s 2ms/step - loss: 1.3011 - mean_absolute_error: 0.5997 - mean_absolute_percentage_error: 1.2651
[1.3010823726654053, 0.5997085571289062, 1.265061616897583]
lstm_model.evaluate(val_ds)
61/61 [==============================] - 0s 735us/step - loss: 192.9559 - mean_absolute_error: 9.7338 - mean_absolute_percentage_error: 6.0281
[192.95591735839844, 9.73377513885498, 6.028115749359131]
start_time = time.time()
lstm_model = tf.keras.models.Sequential([
tf.keras.layers.LSTM(32, return_sequences=False),
tf.keras.layers.Dropout(0.2),
tf.keras.layers.Dense(1)
])
history = compile_and_fit(lstm_model, train_ds, val_ds, num_epochs=300)
print("--- %s seconds ---" % (time.time() - start_time))
--- 180.8377878665924 seconds ---
lstm_model.evaluate(train_ds)
250/250 [==============================] - 1s 2ms/step - loss: 1.6492 - mean_absolute_error: 0.8115 - mean_absolute_percentage_error: 1.9823 - mean_squared_error: 1.6492
[1.6491988897323608, 0.8114990592002869, 1.9822752475738525, 1.6491988897323608]
lstm_model.evaluate(val_ds)
61/61 [==============================] - 1s 7ms/step - loss: 122.2312 - mean_absolute_error: 8.7047 - mean_absolute_percentage_error: 5.4921
[122.23119354248047, 8.704742431640625, 5.49210262298584]
plt.plot(history.history['mean_absolute_error'])
[<matplotlib.lines.Line2D at 0x253f3de82b0>]
plt.plot(history.history['val_mean_absolute_error'])
[<matplotlib.lines.Line2D at 0x253f3e12c10>]
import plotly.graph_objects as go
from sklearn.preprocessing import MinMaxScaler
from tensorflow import keras
from tensorflow.keras.layers import Dense,LSTM,Dropout,Flatten
from sklearn.metrics import mean_squared_error, mean_absolute_error, mean_absolute_percentage_error
train = train_data['Close'].values #Return a Numpy representation of the DataFrame.
test = validate_data['Close'].values
train
array([ 16.71071434, 16.67892838, 16.66392899, ..., 136.75999451,
136.91000366, 136.00999451])
training_values = np.reshape(train,(len(train),1)) # reshape the array into a column with one value per sample
scaler = MinMaxScaler() # normalize to the range [0, 1]
training_values = scaler.fit_transform(training_values)
# assign training values
x_train = training_values[0:len(training_values)-1]
y_train = training_values[1:len(training_values)]
x_train = np.reshape(x_train,(len(x_train),1,1))
x_train[:5]
array([[[0.02138504]],
[[0.02113904]],
[[0.02102296]],
[[0.01924571]],
[[0.01919871]]])
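The x/y assignment above implements one-step-ahead pairing: each (scaled) close is the input and the next day's close is the label. On a toy array (values made up) the shift is easy to see:

```python
import numpy as np

values = np.array([10.0, 11.0, 12.0, 13.0])

x = values[:-1]  # today's price (the input)
y = values[1:]   # tomorrow's price (the label)

# Each input is paired with the value one step ahead
print(x.tolist())  # [10.0, 11.0, 12.0]
print(y.tolist())  # [11.0, 12.0, 13.0]
```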
# creates model
model = Sequential()
model.add(LSTM(128,return_sequences=True,input_shape=(None,1)))
model.add(LSTM(64,return_sequences=False))
model.add(Dense(25))
model.add(Dense(1))
#compile the model
model.compile(optimizer='adam',loss='mean_squared_error')
# Train the model
model.fit(x_train,y_train,epochs=25,batch_size=8)
Epoch 1/25 252/252 [==============================] - 2s 2ms/step - loss: 0.0056
Epoch 2/25 252/252 [==============================] - 1s 2ms/step - loss: 1.2676e-04
...
Epoch 25/25 252/252 [==============================] - 1s 2ms/step - loss: 1.2799e-04
<keras.callbacks.History at 0x253eb541430>
# assign test and predicted values + reshaping + converting back from scaler
test_values = np.reshape(test, (len(test), 1))
test_values = scaler.transform(test_values)
test_values = np.reshape(test_values, (len(test_values), 1, 1))
predicted_price = model.predict(test_values)
predicted_price = scaler.inverse_transform(predicted_price)
predicted_price=np.squeeze(predicted_price)
16/16 [==============================] - 0s 2ms/step
fig = go.Figure()
fig.add_trace(go.Scatter(x=validate_data['Date'],y=validate_data['Close'],name='Close'))
fig.add_trace(go.Scatter(x=validate_data['Date'],y=predicted_price,name='Forecast_LSTM'))
fig.show()
# evaluate forecasts
mse_lstm = mean_squared_error(test, predicted_price)
print('Test MSE: %.3f' % mse_lstm)
mae_lstm = mean_absolute_error(test, predicted_price)
print('Test MAE: %.3f' % mae_lstm)
Test MSE: 0.291 Test MAE: 0.283
mape_lstm = mean_absolute_percentage_error(test, predicted_price)
print('Test MAPE: %.3f' % mape_lstm)
Test MAPE: 0.002
rmse_lstm = math.sqrt(mean_squared_error(test, predicted_price))
print('Test RMSE: %.3f' % rmse_lstm)
Test RMSE: 0.540
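The reported numbers are consistent with each other: RMSE is simply the square root of MSE, and sklearn's MAPE is a fraction, not a percentage (so 0.002 means 0.2%). A quick check with made-up values:

```python
import math
import numpy as np
from sklearn.metrics import mean_squared_error, mean_absolute_error, mean_absolute_percentage_error

y_true = np.array([100.0, 150.0, 200.0])
y_pred = np.array([101.0, 148.0, 203.0])

mse = mean_squared_error(y_true, y_pred)
rmse = math.sqrt(mse)  # RMSE is the square root of MSE
mae = mean_absolute_error(y_true, y_pred)
mape = mean_absolute_percentage_error(y_true, y_pred)  # a fraction, not a percentage

print(mse, rmse, mae, mape)
```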
table_predicted = pd.DataFrame(validate_data['Close']) # create a DataFrame comparing actual and predicted prices
table_predicted['Predicted_price'] = predicted_price
table_predicted
| Close | Predicted_price | |
|---|---|---|
| Date | ||
| 2021-02-10 | 135.389999 | 135.323959 |
| 2021-02-11 | 135.130005 | 135.060410 |
| 2021-02-12 | 135.369995 | 135.303680 |
| 2021-02-16 | 133.190002 | 133.093719 |
| 2021-02-17 | 130.839996 | 130.712158 |
| ... | ... | ... |
| 2023-02-03 | 154.500000 | 154.444962 |
| 2023-02-06 | 151.729996 | 151.724976 |
| 2023-02-07 | 154.649994 | 154.591522 |
| 2023-02-08 | 151.919998 | 151.912323 |
| 2023-02-09 | 150.869995 | 150.875641 |
504 rows × 2 columns
Meta = yf.download("META", start="2001-01-01", end="2023-02-08")
new_df = Meta.filter(['Close'])
new_df
| Close | |
|---|---|
| Date | |
| 2012-05-18 | 38.230000 |
| 2012-05-21 | 34.029999 |
| 2012-05-22 | 31.000000 |
| 2012-05-23 | 32.000000 |
| 2012-05-24 | 33.029999 |
| ... | ... |
| 2023-02-01 | 153.119995 |
| 2023-02-02 | 188.770004 |
| 2023-02-03 | 186.529999 |
| 2023-02-06 | 186.059998 |
| 2023-02-07 | 191.619995 |
2698 rows × 1 columns
last_60_days = new_df[-60:].values
X_test = scaler.fit_transform(last_60_days) # note: this refits the scaler on the last 60 days instead of reusing the training fit
X_test = np.reshape(X_test, (X_test.shape[0],X_test.shape[1], 1))
pred_price = model.predict(X_test)
pred_price = scaler.inverse_transform(pred_price)
print(pred_price)
2/2 [==============================] - 0s 2ms/step [[112.409615] [113.52345 ] [114.686676] [117.46263 ] [113.72695 ] [112.00305 ] [112.58391 ] [110.46506 ] [111.99338 ] [112.76788 ] [111.96434 ] [109.42144 ] [110.07842 ] [118.45381 ] [120.72981 ] [123.70034 ] [122.667496] [114.5897 ] [114.40547 ] [115.763466] [116.31671 ] [115.16191 ] [120.44759 ] [121.84936 ] [116.559425] [119.74709 ] [114.93883 ] [117.47235 ] [120.06813 ] [117.50149 ] [118.3955 ] [117.26834 ] [116.04492 ] [120.55464 ] [120.63249 ] [124.91892 ] [127.48468 ] [127.065025] [130.07214 ] [129.53496 ] [132.97409 ] [132.87634 ] [136.61089 ] [136.8749 ] [135.29088 ] [133.0034 ] [136.0633 ] [139.2121 ] [143.02588 ] [142.89877 ] [141.29509 ] [146.9653 ] [151.30176 ] [146.73076 ] [148.59692 ] [152.64844 ] [187.02283 ] [184.89705 ] [184.45024 ] [189.71873 ]]
Meta = yf.download("META", start="2023-02-07", end="2023-02-09")
print(Meta['Close'])
Date 2023-02-07 191.619995 2023-02-08 183.429993 Name: Close, dtype: float64
aapl = yf.Ticker("AAPL")
# get stock info
aapl.info
# get historical market data
hist = aapl.history(period="max")
# show actions (dividends, splits)
aapl.actions
# show dividends
aapl.dividends
# show splits
aapl.splits
# show financials
aapl.financials
aapl.quarterly_financials
# show major holders
aapl.major_holders
# show institutional holders
aapl.institutional_holders
# show balance sheet
aapl.balance_sheet
aapl.quarterly_balance_sheet
# show cashflow
aapl.cashflow
aapl.quarterly_cashflow
# show earnings
aapl.earnings
aapl.quarterly_earnings
# show sustainability
aapl.sustainability
# show analysts recommendations
aapl.recommendations
# show next event (earnings, etc)
aapl.calendar
# show ISIN code - *experimental*
# ISIN = International Securities Identification Number
aapl.isin
# show options expirations
aapl.options
# get option chain for specific expiration
#opt = aapl.option_chain('YYYY-MM-DD')
# data available via: opt.calls, opt.puts
#https://habr.com/ru/post/485890/
#https://towardsdatascience.com/illustrated-guide-to-lstms-and-gru-s-a-step-by-step-explanation-44e9eb85bf21